Arabic phonology
While many languages have numerous dialects that differ in phonology, contemporary spoken Arabic is more properly described as a continuum of varieties. This article deals primarily with Modern Standard Arabic (MSA), the standard variety shared by educated speakers throughout Arabic-speaking regions. MSA is used in writing in formal print media and orally in newscasts, speeches and formal declarations of many kinds.

Arabic has 28 consonant phonemes, with phonemic contrasts between "emphatic" (pharyngealized or velarized) consonants and non-emphatic ones; it also has three vowel phonemes. By the 8th century, the letter alif no longer represented only a glottal stop but also a long /aː/. As a result, a diacritic symbol, hamza (ء), was introduced to mark the glottal stop when written with alif; hamza can now also be written on its own, without alif, to indicate that sound. In addition, some of these phonemes have coalesced in the various modern dialects, while new phonemes have been introduced through borrowing or phonemic splits. Length is phonemic for consonants as well as for vowels.

Vowels
Classical Arabic has only three short vowels, three long vowels and two diphthongs (formed by combining short /a/ with the semivowels /j/ and /w/), with no allophones. In the various dialects of Arabic, allophony can occur and is partially conditioned by neighboring consonants within the same word.
As a general rule, for example, /a/ and /aː/ are:
* /a, aː/
** retracted to [ɑ] in the environment of a neighboring /r/, /q/ or an emphatic (pharyngealized) consonant: /sˤ/, /dˤ/, /tˤ/, /ðˤ/, /ɫ/ and in a few regional standard pronunciations also /x/ and /ɣ/;
** only in Iraq and the Persian Gulf: before a word boundary;
** advanced to [æ] in the environment of most consonants:
*** labial consonants (/m/, /b/ and /f/),
*** plain (non-pharyngealized) coronal consonants with the exception of /r/ (/θ/, /ð/, /n/, /t/, /d/, /s/, /z/, /l/, /ʃ/ and /d͡ʒ/),
*** pharyngeal consonants (/ħ/ and /ʕ/),
*** glottal consonants (/h/ and /ʔ/),
*** /j/, /k/ and /w/;
** Across North Africa and West Asia, the open vowel may have different contrasting values, being ( , ), ( , ) or without any contrast at all: almost centralized .
** In northwestern Africa, the (near-)open front vowel is raised to or .
* /i, iː, u, uː/
** Across North Africa and West Asia, /i/ may have other values ( or ) and /u/ may have other values ( or ). Sometimes there is one value for each vowel in both short and long lengths, and sometimes two different values for the short and long lengths.
** In Egypt, close vowels have different values; short initial or medial: , ← instead of . Unstressed final long are most often shortened or reduced: → or , → , → .

However, the actual rules governing vowel retraction are a good deal more complex, and there is little in the way of an agreed-upon standard, as there are often competing notions of what constitutes a "prestige" form. Even highly proficient speakers will often import the vowel-retraction rules of their native dialects. Thus, for example, in the Arabic of someone from Cairo, emphatic consonants affect every vowel between word boundaries, whereas certain Saudi speakers exhibit emphasis only on the vowels adjacent to an emphatic consonant. Certain speakers (most notably Levantine speakers) exhibit a degree of asymmetry between leftward and rightward spread of vowel retraction.

The final heavy syllable of a root is stressed. However, the pronunciation of loanwords is highly dependent on the speaker's native variety.
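The contrast drawn above between word-wide emphasis spread (as described for Cairo) and retraction limited to adjacent vowels (as described for certain Saudi speakers) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `realize`, the two mode labels, the choice of [ɑ] and [æ] as the retracted and advanced allophones, and the restriction of the trigger set to four emphatic consonants. It ignores other triggers and the leftward/rightward asymmetry mentioned for Levantine speakers, so it should not be read as a description of any actual dialect's rule system.

```python
# Toy model of emphasis spread onto the open vowel /a/ (short or long).
# All symbols and sets below are illustrative assumptions, not a full analysis.
EMPHATICS = {"sˤ", "dˤ", "tˤ", "ðˤ"}   # assumed trigger set for this sketch
RETRACTED, ADVANCED = "ɑ", "æ"          # assumed allophone symbols


def realize(segments, spread="word"):
    """Return the segments with every /a/ or /aː/ replaced by an allophone.

    spread="word":     an emphatic anywhere in the word retracts every /a/.
    spread="adjacent": only an /a/ directly next to an emphatic is retracted.
    """
    if spread not in ("word", "adjacent"):
        raise ValueError("spread must be 'word' or 'adjacent'")

    word_has_emphatic = any(seg in EMPHATICS for seg in segments)
    surface = []
    for i, seg in enumerate(segments):
        if seg.rstrip("ː") != "a":
            surface.append(seg)          # consonants and other vowels pass through
            continue
        length = seg[1:]                 # "" for short /a/, "ː" for long /aː/
        if spread == "word":
            retract = word_has_emphatic
        else:
            neighbors = segments[max(i - 1, 0):i] + segments[i + 1:i + 2]
            retract = any(n in EMPHATICS for n in neighbors)
        surface.append((RETRACTED if retract else ADVANCED) + length)
    return surface


if __name__ == "__main__":
    word = ["tˤ", "a", "l", "a", "b"]    # toy segment sequence
    print(realize(word, "word"))         # both vowels retracted
    print(realize(word, "adjacent"))     # only the first vowel retracted
```

With the toy input, the word-wide mode retracts both vowels, while the adjacent-only mode retracts just the vowel that touches the emphatic consonant; a word with no emphatic consonant comes out the same in both modes.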
The vowels /o/, /oː/, /e/ and /eː/ appear in varieties of Arabic and in some stable loanwords or foreign names (Peter F. Abboud and Ernest N. McCarus, eds., Elementary Modern Standard Arabic: Volume 1). Examples include the Arabic forms of 'Coca-Cola', 'chocolate', 'doctor', 'John', 'Tom', 'Belgium' and 'secretary'. Foreign words often have a liberal sprinkling of long vowels, as their word shapes do not conform to standardized prescriptive pronunciations written by letters for short vowels (Jack Smart and Frances Altorfer, Teach Yourself Arabic). For short /e/ and /o/, there may be no vowel letter written, as is normal in Arabic (unless they are at the beginning of a word), or the long vowel letters ي (for /e/) or و (for /o/) are used. The letters ي and و are always used to render the long vowels /eː/ and /oː/.

Consonants
Even in the most formal of conventions, pronunciation depends upon a speaker's background. Nevertheless, the number and phonetic character of most of the 28 consonants have a broad degree of regularity among Arabic-speaking regions. Note that Arabic is particularly rich in uvular, pharyngeal, and pharyngealized ("emphatic") sounds. The emphatic coronals (/sˤ/, /dˤ/, /tˤ/, and /ðˤ/) cause assimilation of emphasis to adjacent non-emphatic coronal consonants.
# The phoneme represented by the Arabic letter jīm (ج) has many standard pronunciations: [ɡ] in most of Egypt and some regions in Yemen and Oman. This is also a characteristic of colloquial Egyptian and southern Yemeni dialects. In Morocco and Algeria, it is pronounced as [ɡ] in some words, especially colloquially. In most of North Africa and most of the Levant, the standard is pronounced [ʒ], and in certain regions of the Persian Gulf with [j], while ~ in Literary Arabic. In some Sudanese and Yemeni dialects, it may be either or as it used to be in Classical Arabic. Foreign words containing /ɡ/ may be transcribed with any of several Arabic letters, mainly depending on the regional spoken variety of Arabic or the commonly diacriticized Arabic letter .
Also, can be used in loanwords where it is not the standard pronunciation for the letter jīm (ج).
# Emphatic consonants are pronounced with the back of the tongue approaching the pharynx (see pharyngealization). They are pronounced with velarization by Iraqi and Persian Gulf speakers. , , and can be considered the emphatic counterparts to , , and respectively.
# The so-called "voiced pharyngeal fricative" (/ʕ/) is in fact neither pharyngeal nor fricative, but is more correctly described as a creaky-voiced epiglottal approximant. Its unvoiced counterpart (/ħ/) is likewise epiglottal, although it is a true fricative. Thelwall asserts that the sound of ع is actually a pharyngealized glottal stop [ʔˤ]. Similarly, dialectal and idiolectal variation between stop and continuant realizations of /ʕ/ has been noted in Iraq and Kuwait, with the observation that the distinction is superficial for Arabic speakers and carries "no phonological consequences."
# and are pronounced as and , respectively, in Iraq and Arabian Peninsula excluding Jordan and Syria.
# In most regions, uvular fricatives of the classical period have become velar or post-velar.
# In most pronunciations, /ɫ/ as a phoneme occurs in a handful of loanwords. It also occurs in the name of God, i.e. Allah, except when it follows long or short /i/, in which case it is not emphatic, as in the phrase meaning "in the name of God". However, /ɫ/ is absent in many places, such as Egypt, and is more widespread in certain dialects, such as Iraqi, where the uvulars have velarized surrounding instances of /l/ in certain environments. /ɫ/ also assumes phonemic status more commonly in pronunciations influenced by such dialects. Furthermore, /ɫ/ also occurs as an allophone of /l/ in the environment of emphatic consonants when the two are not separated by /i/.
# Emphatic /r/ exists all over North African pronunciations. The trill /r/ is sometimes reduced to a single vibration when it is not geminated, but it remains potentially a trill rather than a flap [ɾ]; this single-contact realization is intermediate between the trill [r] and the flap [ɾ].
# /p, v/ are not necessarily pronounced by all Arabic speakers, but are often pronounced in names and loanwords. The foreign sounds /p/ and /v/ are usually transcribed as ب (/b/) and ف (/f/), respectively. In some words, they are pronounced as in the original language (/p/ and /v/), e.g. in "Pakistan" or "virus". Sometimes the Persian letter پ (with three dots) and a modified letter ڤ are used for this purpose. As these letters are not present on standard keyboards, they are often simply written with ب and ف; "November" and "caprice", for example, can each be spelled either way (Hans Wehr, Dictionary of Modern Written Arabic, transl. of Arabisches Wörterbuch für die Schriftsprache der Gegenwart, 1952). The use of both sounds may be considered marginal, and Arabs may pronounce the words interchangeably; besides, many loanwords have become Arabized.
# Depending on the region, the plosives are either alveolar or dental.

Long (geminate or double) consonants are pronounced exactly like short consonants, but last longer. In Arabic, they are called mushaddadah ("strengthened", marked with a shaddah), but they are not actually pronounced any "stronger". Between a long consonant and a pause, an epenthetic vowel occurs, but this is only common across regions in West Asia.

Phonotactics
Arabic syllable structure can be summarized as follows, in which parentheses enclose optional components:
* (C1) (S1) V (S2) (C2 (C3))
A syllable thus consists of an optional onset, consisting of a single consonant; an obligatory nucleus, consisting of a vowel optionally preceded and/or followed by a semivowel; and an optional coda, consisting of one or two consonants. The following restrictions apply:
* Onset
** First consonant (C1): Can be any consonant, including a liquid. (The onset consists of only one consonant; consonant clusters are found only in loanwords, where an epenthetic /a/ is sometimes inserted between the consonants.)
* Nucleus
** Semivowel (S1)
** Vowel (V)
** Semivowel (S2)
* Coda
** First consonant (C2): Can be any consonant.
** Second consonant (C3): Can also be any consonant.

Local variations
Spoken varieties differ from Classical Arabic and Modern Standard Arabic not only in grammar but also in pronunciation. Outside of the Arabian Peninsula, a major linguistic division is between sedentary varieties, largely urban varieties. Inside the Arabian Peninsula and in Iraq, the two types are less distinct; but the language of the urbanized Hijaz, at least, strongly resembles a conservative sedentary variety. Some examples of variation:
;Consonants
* The phoneme /ɡ/: the word "golf" may be spelled (mainly in Egypt), and (mainly in the Levant and Iraq), (mainly in the Arabian Peninsula), (in Morocco) or (in West Asia). Standard Arabic and the dialects originally lacked /v/ and /p/; these sounds, which occur largely in loanwords such as Volvo and seven-ap ('Seven-Up'), are written with the Arabic letters for /f/ and /b/, respectively. /t͡ʃ/ is another possible loanword phoneme, as in the word sandawitsh ('sandwich'), though a number of varieties instead break up the /t/ and /ʃ/ sounds with an epenthetic vowel; Cairene is cited as an example of a variety without this phoneme. Egyptian Arabic treats /t͡ʃ/ as two consonants ([tʃ]) and inserts [e], as in [teʃ]C or [etʃ], when it occurs before or after another consonant. /t͡ʃ/ is found as normal in Iraqi Arabic and Gulf Arabic (Gulf Arabic Sounds). The Persian character چ is used for writing [tʃ]. Otherwise, Arabic usually substitutes other letters in the transliteration of names and loanwords; normally the combination تش (tā’-shīn) is used to transliterate [tʃ].
* Highly varied realizations of the original velar stops (especially Classical < and < * ; to a lesser extent, ). is frequently voiced to , debuccalized to or fronted to ; palatalized pronunciations are sometimes seen, as in the name of the city of Sharjah.
is frequently softened to or palatalized to , but appears as in most of Egypt. is frequently palatalized to in Iraq and the Persian Gulf. * Split of original into two phonemes, distinguished primarily by how they affect neighboring vowels. This has progressed the farthest in North Africa. See Moroccan Arabic, Algerian Arabic, Tunisian Arabic and Libyan Arabic * Loss of the glottal stop in places where it is historically attested, as in → . ; Vowels * Development of highly distinctive allophones of and , with highly fronted , or in non-emphatic contexts, and retracted in emphatic contexts. The more extreme distinctions are characteristic of sedentary varieties, while Bedouin and conservative Arabian-peninsula varieties have much closer allophones. In some of the sedentary varieties, the allophones are gradually splitting into new phonemes under the influence of loanwords, where the allophone closest in sound to the source-language vowel often appears regardless of the presence or absence of nearby emphatic consonants. * Spread of "emphasis", visible in the backing of phonemic . In conservative varieties of the Arabic peninsula, only adjacent to emphatic consonants is affected, while in Cairo, an emphatic consonant anywhere in a word tends to trigger emphatic allophones throughout the entire word. Dialects of the Levant are somewhere in between. Moroccan Arabic is unusual in that and have clear emphatic allophones as well (typically lowered, e.g. to and ). * Monophthongization of diphthongs such as and to and , respectively ( and in parts of the Maghrib, such as in Moroccan Arabic). Mid vowels may also be present in loanwords such as (Melbórn Melbourne), ( or '(male) secretary') and ( or , 'doctor'). * Raising of word final to . In some parts of Levant, also word-medial to . See Lebanese Arabic. * Loss of final short vowels (with sometimes remaining), and shortening of final long vowels. This triggered the loss of most Classical Arabic case and mood distinctions. 
* Collapse and deletion of short vowels. In many varieties, such as North Mesopotamian, many Levantine dialects, many Bedouin dialects of the Maghrib, and Mauritanian, short and have collapsed to schwa and exhibit very little distinction so that such dialects have two short vowels, and . Many Levantine dialects show partial collapse of and , which appear as such only in the next-to-last phoneme of a word (i.e. followed by a single word-final consonant), and merge to elsewhere. A number of dialects that still allow three short vowels in all positions, such as Egyptian Arabic, nevertheless show little functional contrast between and as a result of past sound changes converting one sound into the other. Arabic varieties everywhere have a tendency to delete short vowels (especially other than ) in many phonological contexts. When combined with the operation of inflectional morphology, disallowed consonant clusters often result, which are broken up by epenthetic short vowels, automatically inserted by phonological rules. In these respects (as in many others), Moroccan Arabic has the most extreme changes, with all three short vowels , , collapsing to a schwa , which is then deleted in nearly all contexts. This variety, in fact, has essentially lost the quantitative distinction between short and long vowels in favor of a new qualitative distinction between unstable "reduced" vowels (especially ) and stable, half-long "full" vowels , , (the reflexes of original long vowels). Classical Arabic words borrowed into Moroccan Arabic are pronounced entirely with "full" vowels regardless of the length of the original vowel. Cairene The Arabic of Cairo (often called "Egyptian Arabic" or more correctly "Cairene Arabic") is a typical sedentary variety and a de facto standard variety among certain segments of the Arabic-speaking population, due to the dominance of Egyptian movies. Cairene Arabic has emphatic labials and and emphatic with marginal phonemic status. 
Cairene has also merged the interdental consonants with the dental plosives (e.g. → , 'three') except in loanwords from Classical Arabic where they are nativized as sibilant fricatives (e.g. → , 'secondary school'). Cairene speakers pronounce as and debuccalized to (again, loanwords from Classical Arabic have reintroduced the earlier sound or approximated to with the front vowel around it changed to the back vowel ). Classical Arabic diphthongs and became realized as and respectively. Still, Egyptian Arabic sometimes has minimal pairs like ('carrying' f.s.) vs ('burden'). 'pocket' + 'our' → collapsing with which means ('cheese' or 'our pocket'), because Cairene phonology can't have long vowels before two consonants. Cairene also has as a marginal phoneme from loanwords from languages other than Classical Arabic. Sanaa Varieties such as that of Sanaa, Yemen, are more conservative and retain most phonemic contrasts of Classical Arabic. Sanaani possesses as a reflex of Classical (which still functions as an emphatic consonant). In unstressed syllables, Sanaani short vowels may be reduced to . is voiced to in initial and intervocalic positions. Morocco Of all the mainstream varieties of Arabic, Moroccan Arabic is likely the one that has diverged the most from Classical Arabic, similarly to the position of French in the Romance languages and English among the Germanic languages. As described above, Moroccan has heavily innovated in its vowel phonology, under heavy Berber influence. Short vowels and merged into . More recently, most instances of short have also merged into ; the few that remain are mostly in the vicinity of velars and uvulars, which suggests an alternative analysis with phonemically rounded consonants (e.g. labiovelars) and only one short vowel . This schwa, in turn, is phonemically deleted in all contexts except directly followed by a single word-final consonant or in some three-consonant words of the shape CəCC. 
This inevitably results in some very long, complex consonant clusters, which (unlike most other Arabic varieties) Moroccan Arabic is remarkably tolerant of, only tending to insert epenthetic schwas to break up the clusters at a slow rate of speech. Unlike in other varieties, doubled consonants are never reduced, but are pronounced clearly whether occurring at the beginning of a word, end of a word, between vowels or before or after a consonant. With the collapse of short vowels, speakers no longer perceive a long vs. short distinction in vowels, which has been replaced with a "full" vs. "reduced/unstable" distinction. "Full" vowels (actually pronounced half-long) substitute for both the long and short vowels of Classical Arabic in borrowings; as a result, these borrowings can be immediately identified by their phonology. A number of other unique or unusual developments have taken place. Stress is, for the most part, not detectable at all; to the extent stressed syllables can be identified, there is often no consistent pattern governing which syllable is stressed. Original has split into two phonemes and , reflecting the origin of Moroccan Arabic as a mixture of a sedentary and Bedouin dialect. Original diphthongs , have merged into full vowels , rather than generating new vowel qualities; but "long diphthongs" , also exist, best analyzed as a combination of full vowel and semivowel. Unlike most other varieties, emphasis not only triggers front/back allophones in , but also high/low allophones in and , and varies between non-emphatic , emphatic , and pharyngeal-environment . On the other hand, emphasis spreads only as far as the first full vowel in either direction, unlike in most sedentary varieties where emphasis can spread much more widely, sometimes throughout the entire word. 
For the purposes of emphasis, splits completely into non-emphatic and emphatic , distinguishable mostly by their effects on adjacent vowels; with very few exceptions, the choice of one or another is consistent across all words derived from a given root. Most emphatic/nonemphatic pairs behave similarly, but is affricated while is non-affricated , so it is always possible to distinguish the two without recourse to their effects on surrounding vowels. Distribution The most frequent consonant phoneme is , the rarest is . The frequency distribution of the 28 consonant phonemes, based on the 2,967 triliteral roots listed by Wehr is (with the percentage of roots in which each phoneme occurs): This distribution does not necessarily reflect the actual frequency of occurrence of the phonemes in speech, since pronouns, prepositions and suffixes are not taken into account, and the roots themselves will occur with varying frequency. In particular, occurs in several extremely common affixes (occurring in the marker for second-person or feminine third-person as a prefix, the marker for first-person or feminine third-person as a suffix, and as the second element of Forms VIII and X as an infix) despite being fifth from last on Wehr's list. The list does give, however, an idea of which phonemes are more marginal than others. Note that the five least frequent letters are among the six letters added to those inherited from the Phoenician alphabet, namely, , , , , and . Sample The Literary Arabic sample text is a reading of The North Wind and the Sun by a speaker who was born in Safed, lived and was educated in Beirut from age 8 to 15, subsequently studied and taught in Damascus, studied phonetics in Scotland and since then has resided in Scotland and Kuwait. 
Normal orthographic version
Diacriticized orthographic version

Phonemic transcription
/wa ʔið bi musaːfirin jatˤluʕu mutalafːiʕan bi ʕabaːʔatin samijka fa tːafaqataː ʕalaː ʕtibaːri sːaːbiqi fij ʔid͡ʒbaːri lmusaːfiri ʕalaː xalʕi ʕabaːʔatihi lʔaqwaː ʕasˤafat rijħu ʃːamaːli bi ʔaqsˤaː maː statˤaːʕat min quwːa wa laːkin kulːamaː zdaːda lʕasˤf izdaːda lmusaːfiru tadaθːuran biʕabaːʔatih ʔilaː ʔan ʔusqitˤa fij jadi rːijħ fa taxalːat ʕan muħaːwalatihaː baʕdaʔiðin satˤaʕati ʃːamsu bidifʔihaː fa maːkaːna mina lmusaːfiri ʔilːaː ʔan xalaʕa ʕabaːʔatahu ʕalaː tːawː wa haːkaðaː dˤtˤurat rijħu ʃːamaːli ʔilaː lʔiʕtiraːfi biʔanːa ʃːamsa kaːnat hija lʔaqwaː/

Phonemic transcription 2
/wa ʔið bi musaːfir jatˤlaʕ mutalaffiʕan bi ʕabaːʔa samiːka fat tafaqataː ʕala ʕtibaːr is saːbiq fiː ʔidʒbaːr al musaːfir ʕalaː xalʕI ʕabaːʔatih al ʔaqwaː ʕasˤafat riːħ aʃ ʃamaːl bi ʔaqsˤaː ma statˤaːʕat min quwwa wa laːkin kullama zdaːd al ʕasˤf izdaːd al musaːfir tadaθθuran bi ʕabaːʔatih ʔilaː ʔan ʔusqitˤ fiː jad ar riːħ fa taxallat ʕan muħaːwalatihaː baʕdaʔiðin satˤaʕat aʃ ʃamsI bi difʔihaː fa maː kaːn min al musaːfir ʔillaː ʔan xalaʕa ʕabaːʔatah ʕala ttaww wa haːkaða dˤtˤurrat riːħ aʃ ʃamaːl ʔila lʔiʕtiraːf biʔann aʃ ʃamsI kaːnat hija lʔaqwaː/

Phonetic transcription (Egypt)
wæ ʔɪð bi mʊˈsæːfeɾ ˈjɑtˤlɑʕ mʊtæˈlæf.feʕ bi ʕæˈbæːʔæ sæˈmiːkæ fæt tæfɑqɑˈtæː ˈʕælæ ʕ.teˈbɑːɾ ɪs ˈsɑːbeq fiː ʔeɡbɑːɾ æl mʊˈsæːfeɾ ˈʕælæ ˈxælʕe ʕæbæːˈʔæt(i)hi lˈʔɑqwɑː ˈʕɑsˤɑfɑt ɾiːħ æʃ ʃæˈmæːl bi ˈʔɑqsˤɑ mæ stæˈtˤɑːʕɑt mɪn ˈqow.wɑ wæ ˈlæːkɪn kʊlˈlæmæ zˈdæːd æl ʕɑsˤf ɪzˈdæːd æl mʊˈsæːfeɾ tædæθˈθʊɾæn bi ʕæbæːˈʔætih ˈʔilæ ʔæn ˈʔosqetˤ fiː jæd æɾˈɾiːħ fæ tæˈxæl.læt ʕæn mʊħæːwæˈlæt(i)hæ bæʕdæˈʔiðin ˈsɑtˤɑʕɑt æʃ ˈʃæm.se bi dɪfˈʔihæ fæ mæː kæːn mɪn æl mʊˈsæːfeɾ ˈʔil.læ ʔæn ˈxælæʕ ʕæbæːˈʔætæh ʕælætˈtæw wæ hæːˈkæðæ tˈtˤoɾ.ɾɑt ɾiːħ æʃ ʃæˈmæːl ˈʔilæ lʔeʕteˈɾɑːf biˈʔænn æʃ ˈʃæm.se ˈkæːnæt ˈhɪ.jæ lˈʔɑqwɑ

ALA-LC transliteration

References
Bibliography
* Hans Wehr (1952), ''Arabisches Wörterbuch für die Schriftsprache der Gegenwart''